Abstract


This document discusses both use cases and operational experience
with Multipath TCP (MPTCP) in real networks. It lists several
prominent use cases where Multipath TCP has been considered and is
being used. It also gives insight into some heuristics and decisions
that have helped to realize these use cases and suggests possible
improvements.


Status of This Memo 


This document is not an Internet Standards Track specification; it is 
published for informational purposes. 


This document is a product of the Internet Engineering Task Force
(IETF). It represents the consensus of the IETF community. It has
received public review and has been approved for publication by the
Internet Engineering Steering Group (IESG). Not all documents
approved by the IESG are a candidate for any level of Internet
Standard; see Section 2 of RFC 7841.


Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
http://www.rfc-editor.org/info/rfc8041.
1. Introduction


Multipath TCP was specified in [RFC6824] and five independent 
implementations have been developed. As of November 2016, Multipath 
TCP has been or is being implemented on the following platforms: 


o Linux kernel [MultipathTCP-Linux] 

o Apple iOS and macOS

o Citrix load balancers 

o FreeBSD [FreeBSD-MPTCP] 

o Oracle Solaris 

The first three implementations are known to interoperate. Three of 
these implementations are open source (Linux kernel, FreeBSD and 
Apple's iOS and macOS). Apple's implementation is widely deployed. 
Since the publication of [RFC6824] as an Experimental RFC, experience 
has been gathered by various network researchers and users about the 
operational issues that arise when Multipath TCP is used in today's 
Internet. 

When the MPTCP working group was created, several use cases for
Multipath TCP were identified [RFC6182]. Since then, other use cases
have been proposed and some have been tested and even deployed. We
describe these use cases in Section 2.


Section 3 focuses on the operational experience with Multipath TCP.
Most of this experience comes from the utilization of the Multipath
TCP implementation in the Linux kernel [MultipathTCP-Linux]. This
open-source implementation has been downloaded and implemented by
thousands of users all over the world. Many of these users have
provided direct or indirect feedback by writing documents (scientific
articles or blog messages) or posting to the mptcp-dev mailing list
(see https://listes-2.sipr.ucl.ac.be/sympa/arc/mptcp-dev). This
Multipath TCP implementation is actively maintained and continuously
improved. It is used on various types of hosts, ranging from
smartphones or embedded routers to high-end servers.


The Multipath TCP implementation in the Linux kernel is not, by far,
the most widespread deployment of Multipath TCP. Since September
2013, Multipath TCP has also been supported on smartphones and
tablets beginning with iOS7 [IETFJ]. There are likely hundreds of
millions of MPTCP-enabled devices. This Multipath TCP implementation
is currently only used to support the Siri voice recognition/control
application. Some lessons learned from this deployment are described
in [IETFJ].


Section 3 is organized as follows. Supporting the middleboxes was
one of the difficult issues in designing the Multipath TCP protocol.
We explain in Section 3.1 which types of middleboxes the Linux kernel
implementation of Multipath TCP supports and how it reacts upon
encountering these. Section 3.2 summarizes the MPTCP-specific
congestion controls that have been implemented. Sections 3.3 to 3.7
discuss heuristics and issues with respect to subflow management as
well as the scheduling across the subflows. Section 3.8 explains
some problems that occurred with subflows having different Maximum
Segment Size (MSS) values. Section 3.9 presents issues with respect
to content delivery networks and suggests a solution to this issue.
Finally, Section 3.10 documents an issue with captive portals where
MPTCP will behave suboptimally.


2. Use Cases


Multipath TCP has been tested in several use cases. There is already
an abundant amount of scientific literature on Multipath TCP 
[MPTCPBIB]. Several of the papers published in the scientific 
literature have identified possible improvements that are worth being 
discussed here. 


2.1. Datacenters 


A first, although initially unexpected, documented use case for
Multipath TCP has been in datacenters [HotNets][SIGCOMM11]. Today's
datacenters are designed to provide several paths between
single-homed servers. The multiplicity of these paths comes from the
utilization of Equal-Cost Multipath (ECMP) and other load-balancing
techniques inside the datacenter. Most of the deployed
load-balancing techniques in datacenters rely on hashes computed over
the five tuple. Thus, all packets from the same TCP connection
follow the same path, so they are not reordered. The results in
[HotNets] demonstrate by simulations that Multipath TCP can achieve a
better utilization of the available network by using multiple
subflows for each Multipath TCP session. Although [RFC6182] assumes
that at least one of the communicating hosts has several IP
addresses, [HotNets] demonstrates that Multipath TCP is beneficial
when both hosts are single-homed. This idea is analyzed in more
detail in [SIGCOMM11], where the Multipath TCP implementation in the
Linux kernel is modified to be able to use several subflows from the
same IP address. Measurements in a public datacenter show the
quantitative benefits of Multipath TCP [SIGCOMM11] in this
environment.


Although ECMP is widely used inside datacenters, this is not the only
environment where there are different paths between a pair of hosts. 
ECMP and other load-balancing techniques such as Link Aggregation 
Groups (LAGs) are widely used in today's networks; having multiple 
paths between a pair of single-homed hosts is becoming the norm 
instead of the exception. Although these multiple paths often have 
the same cost (from an IGP metrics viewpoint), they do not 
necessarily have the same performance. For example, [IMC13c] reports 
the results of a long measurement study showing that load-balanced 
Internet paths between the same pair of hosts can have huge delay
differences. 
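

The effect of five-tuple hashing on subflows can be illustrated with
a short sketch. It is not taken from any of the cited papers: the
hash function (SHA-256), the addresses, the port range, and the number
of paths are all assumptions chosen for illustration, and real
switches use vendor-specific hashes.

```python
import hashlib

def ecmp_path(src_ip, dst_ip, src_port, dst_port, proto, n_paths):
    """Pick one of n_paths from a hash of the five-tuple.

    Illustrative only: SHA-256 stands in for whatever hash a real
    load balancer would use.
    """
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_paths

# Every packet of one TCP connection carries the same five-tuple,
# so all of them are mapped to the same path.
single = {ecmp_path("10.0.0.1", "10.0.1.1", 40000, 80, 6, 4)
          for _ in range(100)}

# Subflows between the same two addresses that differ only in their
# source port may be mapped to different paths.
subflows = {ecmp_path("10.0.0.1", "10.0.1.1", port, 80, 6, 4)
            for port in range(40000, 40064)}

print("paths used by one connection:", len(single))
print("paths used by 64 subflows:", len(subflows))
```

This is why a single-homed pair of hosts can still benefit from
several subflows: each subflow is a distinct five-tuple from the
point of view of the load balancer.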


2.2. Cellular/WiFi Offload 


A second use case that has been explored by several network 
researchers is the cellular/WiFi offload use case. Smartphones or 
other mobile devices equipped with two wireless interfaces are a very 
common use case for Multipath TCP. As of September 2015, this is also
the largest deployment of MPTCP-enabled devices [IETFJ]. It has been
briefly discussed during IETF 88 [IETF88], but there is no published
paper or report that analyzes this deployment. For this reason, we
only discuss published papers that have mainly used the Multipath TCP 
implementation in the Linux kernel for their experiments. 


The performance of Multipath TCP in wireless networks was briefly 
evaluated in [NSDI12]. One experiment analyzes the performance of 
Multipath TCP on a client with two wireless interfaces. This 
evaluation shows that when the receive window is large, Multipath TCP 
can efficiently use the two available links. However, if the window 
becomes smaller, then packets sent on a slow path can block the 
transmission of packets on a faster path. In some cases, the 
performance of Multipath TCP over two paths can become lower than the 
performance of regular TCP over the best performing path. Two 
heuristics, reinjection and penalization, are proposed in [NSDI12] to 
solve this identified performance problem. These two heuristics have 
since been used in the Multipath TCP implementation in the Linux 
kernel.  [CONEXT13] explored the problem in more detail and revealed 
some other scenarios where Multipath TCP can have difficulties in 
efficiently pooling the available paths. Improvements to the 
Multipath TCP implementation in the Linux kernel are proposed in 
[CONEXT13] to cope with some of these problems. 


The first experimental analysis of Multipath TCP in a public wireless 
environment was presented in [Cellnet12]. These measurements explore 
the ability of Multipath TCP to use two wireless networks (real WiFi 
and 3G networks). Three modes of operation are compared. The first 
mode of operation is the simultaneous use of the two wireless 
networks. In this mode, Multipath TCP pools the available resources 
and uses both wireless interfaces. This mode provides fast handover
from WiFi to cellular or the opposite when the user moves. 
Measurements presented in [CACM14] show that the handover from one 
wireless network to another is not an abrupt process. When a host 
moves, there are regions where the quality of one of the wireless 
networks is weaker than the other, but the host considers this 
wireless network to still be up. When a mobile host enters such 
regions, its ability to send packets over another wireless network is 
important to ensure a smooth handover. This is clearly illustrated
by the packet trace discussed in [CACM14].


Many cellular networks use volume-based pricing; users often prefer 
to use unmetered WiFi networks when available instead of metered 
cellular networks. [Cellnet12] implements support for the MP_PRIO
option to explore two other modes of operation. 


In the backup mode, Multipath TCP opens a TCP subflow over each 
interface, but the cellular interface is configured in backup mode. 
This implies that data flows only over the WiFi interface when both 
interfaces are considered to be active. If the WiFi interface fails, 
then the traffic switches quickly to the cellular interface, ensuring 
a smooth handover from the user's viewpoint [Cellnet12]. The cost of 
this approach is that the WiFi and cellular interfaces are likely to 
remain active all the time since all subflows are established over 
the two interfaces. 


The single-path mode is slightly different. This mode benefits from 
the break-before-make capability of Multipath TCP. When an MPTCP 
session is established, a subflow is created over the WiFi interface. 
No packet is sent over the cellular interface as long as the WiFi 
interface remains up [Cellnet12]. This implies that the cellular 
interface can remain idle and battery capacity is preserved. When 
the WiFi interface fails, a new subflow is established over the 
cellular interface in order to preserve the established Multipath TCP 
sessions. Compared to the backup mode described earlier, 
measurements reported in [Cellnet12] indicate that this mode of 
operation is characterized by a throughput drop while the cellular 
interface is brought up and the subflows are reestablished. 
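

The difference between the simultaneous and backup modes can be
summarized by the rule used to decide which subflows may carry data.
The sketch below is a simplified model, not code from [Cellnet12] or
the Linux kernel; the Subflow class and its fields are hypothetical
names introduced for illustration.

```python
from dataclasses import dataclass

@dataclass
class Subflow:
    name: str
    up: bool = True
    backup: bool = False  # hypothetical flag, set when MP_PRIO marks backup

def usable_subflows(subflows):
    """Return the subflows that may carry data.

    Regular subflows are preferred; backup subflows are used only
    when no regular subflow is up.
    """
    alive = [s for s in subflows if s.up]
    regular = [s for s in alive if not s.backup]
    return regular if regular else alive

wifi = Subflow("wifi")
cellular = Subflow("cellular", backup=True)

# Both interfaces up: data flows only over WiFi.
print([s.name for s in usable_subflows([wifi, cellular])])

# WiFi fails: the already-established cellular subflow takes over.
wifi.up = False
print([s.name for s in usable_subflows([wifi, cellular])])
```

In the single-path mode, by contrast, the cellular subflow would not
exist in the list at all until the WiFi interface fails, which is
where the throughput drop reported in [Cellnet12] comes from.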


From a protocol viewpoint, [Cellnet12] discusses the problem posed by
the unreliability of the REMOVE_ADDR option and proposes a small
protocol extension to allow hosts to reliably exchange this option.
It would be useful to analyze packet traces to understand whether the
unreliability of the REMOVE_ADDR option poses an operational problem
in real deployments.


Another study of the performance of Multipath TCP in wireless
networks was reported in [IMC13b]. This study uses laptops connected 
to various cellular ISPs and WiFi hotspots. It compares various file 
transfer scenarios. [IMC13b] observes that 4-path MPTCP outperforms 
2-path MPTCP, especially for larger files. However, for three 
congestion-control algorithms (LIA, OLIA, and Reno -- see
Section 3.2), there is no significant performance difference for file
sizes smaller than 4 MB.


A different study of the performance of Multipath TCP with two 
wireless networks is presented in [INFOCOM14]. In this study the two 
networks had different qualities: a good network and a lossy network. 
When using two paths with different packet-loss ratios, the Multipath 
TCP congestion-control scheme moves traffic away from the lossy link 
that is considered to be congested. However, [INFOCOM14] documents 
an interesting scenario that is summarized hereafter. 


client ----------- path1 -------- server
  |                                  |
  +--------------- path2 ------------+


Figure 1: Simple network topology 


Initially, the two paths in Figure 1 have the same quality and
Multipath TCP distributes the load over both of them. During the
transfer, path2 becomes lossy, e.g., because the client moves.
Multipath TCP detects the packet losses and the lost packets are
retransmitted over path1. This enables the data transfer to continue
over this path. However, the subflow over path2 is still up and
transmits one packet from time to time. Although the N packets lost
on path2 have been acknowledged over the first subflow (at the MPTCP
level), they have not been acknowledged at the TCP level over the
second subflow. To preserve the continuity of the sequence numbers
over the second subflow, TCP will continue to retransmit these
segments until either they are acknowledged or the maximum number of
retransmissions is reached. This behavior is clearly inefficient and
may lead to blocking since the second subflow will consume window
space to be able to retransmit these packets. [INFOCOM14] proposes a
new Multipath TCP option to solve this problem. In practice, a new
TCP option is probably not required. When the client detects that
the data transmitted over the second subflow has been acknowledged
over the first subflow, it could decide to terminate the second
subflow by sending a RST segment. If the interface associated with
this subflow is still up, a new subflow could be immediately
reestablished. It would then be immediately usable to send new data
and would not be forced to first retransmit the previously
transmitted data. As of this writing, this dynamic management of the
subflows is not yet implemented in the Multipath TCP implementation
in the Linux kernel.


Some studies have started to analyze the performance of Multipath TCP
on smartphones with real applications. In contrast with the bulk 
transfers that are used by many publications, many deployed 
applications do not exchange huge amounts of data and mainly use 
small connections. [COMMAG2016] proposes a software testing
framework that makes it possible to automate Android applications to
study their interactions with Multipath TCP.  [PAM2016] analyzes a
one-month packet trace of all the packets exchanged by a dozen
smartphones
utilized by regular users. This analysis reveals that short 
connections are important on smartphones and that the main benefit of 
using Multipath TCP on smartphones is the ability to perform seamless 
handovers between different wireless networks. Long connections 
benefit from these handovers. 


2.3. Multipath TCP Proxies 


As Multipath TCP is not yet widely deployed on both clients and 
servers, several deployments have used various forms of proxies. Two 
families of solutions are currently being used or tested. 


A first use case is when an MPTCP-enabled client wants to use several 
interfaces to reach a regular TCP server. A typical use case is a 
smartphone that needs to use both its WiFi and its cellular interface 
to transfer data. Several types of proxies are possible for this use 
case. An HTTP proxy deployed on an MPTCP-capable server would enable
the smartphone to use Multipath TCP to access regular web servers.
Obviously, this solution only works for applications that rely on
HTTP. Another possibility is to use a proxy that can convert any
Multipath TCP connection into a regular TCP connection.
MPTCP-specific proxies have been proposed [HotMiddlebox13b] [HAMPEL].


Another possibility leverages the SOCKS protocol [RFC1928]. SOCKS is 
often used in enterprise networks to allow clients to reach external 
servers. For this, the client opens a TCP connection to the SOCKS 
server that relays it to the final destination. If both the client 
and the SOCKS server use Multipath TCP, but not the final 
destination, then Multipath TCP can still be used on the path between 
the clients and the SOCKS server. At IETF 93, Korea Telecom
announced that they had deployed (in June 2015) a commercial service
that uses Multipath TCP on smartphones. These smartphones access
regular TCP servers through a SOCKS proxy. This enables them to
achieve throughputs of up to 850 Mbps [KT].


Measurements performed with Android smartphones [Mobicom15] show that
popular applications work correctly through a SOCKS proxy and MPTCP- 
enabled smartphones. Thanks to Multipath TCP, long-lived connections 
can be spread over the two available interfaces. However, for short- 
lived connections, most of the data is sent over the initial subflow 
that is created over the interface corresponding to the default route 
and the second subflow is almost not used [PAM2016]. 
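

For readers unfamiliar with SOCKS, the sketch below builds the two
messages a client sends to a SOCKS version 5 server to request a
relayed connection, following the message layout of [RFC1928]. It
only constructs the bytes; it is not taken from any of the deployments
discussed above, and the function names and the example host are
illustrative. Multipath TCP does not change these messages: it
operates underneath, on the TCP connection between the client and the
SOCKS server.

```python
import struct

SOCKS_VERSION = 0x05
CMD_CONNECT = 0x01
ATYP_DOMAINNAME = 0x03
METHOD_NO_AUTH = 0x00

def socks5_greeting():
    """Method-selection message: version, number of methods, methods."""
    return bytes([SOCKS_VERSION, 0x01, METHOD_NO_AUTH])

def socks5_connect_request(host, port):
    """CONNECT request naming the final destination by domain name."""
    name = host.encode("ascii")
    if len(name) > 255:
        raise ValueError("domain name too long for SOCKS5")
    header = bytes([SOCKS_VERSION, CMD_CONNECT, 0x00,
                    ATYP_DOMAINNAME, len(name)])
    # The destination port is sent in network byte order.
    return header + name + struct.pack("!H", port)

request = socks5_connect_request("example.org", 80)
print(socks5_greeting().hex(), request.hex())
```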


A second use case is when Multipath TCP is used by middleboxes,
typically inside access networks. Various network operators are
discussing and evaluating solutions for hybrid access networks
[TR-348]. Such networks arise when a network operator controls two
different access network technologies, e.g., wired and cellular, and
wants to combine them to improve the bandwidth offered to the end
users [HYA-ARCH]. Several solutions are currently investigated for
such networks [TR-348]. Figure 2 shows the organization of such a
network. When a client creates a normal TCP connection, it is
intercepted by the Hybrid CPE (HCPE) that converts it into a
Multipath TCP connection so that it can use the available access
networks (DSL and LTE in the example). The Hybrid Access Gateway
(HAG) does the opposite to ensure that the regular server sees a
normal TCP connection. Some of the solutions currently discussed for
hybrid networks use Multipath TCP on the HCPE and the HAG. Other
solutions rely on tunnels between the HCPE and the HAG [GRE-NOTIFY].


client --- HCPE ------ DSL ------- HAG --- internet --- server
            |                       |
            +------- LTE -----------+


Figure 2: Hybrid Access Network


3. Operational Experience


3.1. Middlebox Interference


The interference caused by various types of middleboxes has been an
important concern during the design of the Multipath TCP protocol. 


Three studies on the interactions between Multipath TCP and
middleboxes are worth discussing.


The first analysis appears in [IMC11]. This paper was the main
motivation for Multipath TCP incorporating various techniques to cope 
with middlebox interference. More specifically, Multipath TCP has 
been designed to cope with middleboxes that: 


o change source or destination addresses

o change source or destination port numbers 

o change TCP sequence numbers 

o split or coalesce segments 

o remove TCP options 

o modify the payload of TCP segments 

These middlebox interferences have all been included in the MBtest
suite [MBTest]. This test suite is used in [HotMiddlebox13] to
verify the reaction of the Multipath TCP implementation in the Linux
kernel [MultipathTCP-Linux] when faced with middlebox interference.
The test environment used for this evaluation is a dual-homed client
connected to a single-homed server. The middlebox behavior can be
activated on any of the paths. The main results of this analysis
are:


o the Multipath TCP implementation in the Linux kernel is not 
affected by a middlebox that performs NAT or modifies TCP sequence 
numbers 


o when a middlebox removes the MP_CAPABLE option from the initial
SYN segment, the Multipath TCP implementation in the Linux kernel 
falls back correctly to regular TCP 


o when a middlebox removes the DSS option from all data segments, 
the Multipath TCP implementation in the Linux kernel falls back 
correctly to regular TCP 


o when a middlebox performs segment coalescing, the Multipath TCP 
implementation in the Linux kernel is still able to accurately 
extract the data corresponding to the indicated mapping 


o when a middlebox performs segment splitting, the Multipath TCP 
implementation in the Linux kernel correctly reassembles the data 
corresponding to the indicated mapping. [HotMiddlebox13] shows, 
in Figure 4 in Section 3.3, a corner case with segment splitting 
that may lead to a desynchronization between the two hosts.


The interactions between Multipath TCP and real deployed middleboxes
are also analyzed in [HotMiddlebox13]; a particular scenario with the 
FTP Application Level Gateway running on a NAT is described. 
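

The two fallback cases in the list of results above can be condensed
into a small decision function. This is a deliberately simplified
model for illustration, not the logic of the Linux implementation:
the function and its two boolean parameters are hypothetical, and
real fallback handling in [RFC6824] covers more cases (for example,
checksum failures and interference on additional subflows).

```python
def connection_mode(mp_capable_survives_syn, dss_survives_on_data):
    """Return how a connection ends up operating, given what a
    middlebox on the initial path lets through."""
    if not mp_capable_survives_syn:
        # The peer never sees MP_CAPABLE: plain TCP from the start.
        return "regular TCP (fallback during handshake)"
    if not dss_survives_on_data:
        # Handshake succeeded, but data arrives without mappings.
        return "regular TCP (fallback after handshake)"
    return "Multipath TCP"

for syn_ok in (True, False):
    for dss_ok in (True, False):
        print(syn_ok, dss_ok, "->", connection_mode(syn_ok, dss_ok))
```

In both fallback cases, the application keeps a working connection;
only the multipath capability is lost.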


Middlebox interference can also be detected by analyzing packet 
traces on MPTCP-enabled servers. A closer look at the packets 
received on the multipath-tcp.org server [TMA2015] shows that among
the 184,000 Multipath TCP connections, only 125 fell back to regular
TCP. These connections originated from 28 different client IP
addresses. They include 91 HTTP connections and 34 FTP
connections. The FTP interference is expected since Application 
Level Gateways used for FTP modify the TCP payload and the DSS 
Checksum detects these modifications. The HTTP interference appeared 
only on the direction from server to client and could have been 
caused by transparent proxies deployed in cellular or enterprise 
networks. A longer trace is discussed in [COMCOM2016] and similar 
conclusions about the middlebox interference are provided. 


From an operational viewpoint, knowing that Multipath TCP can cope 
with various types of middlebox interference is important. However, 
there are situations where the network operators need to gather 
information about where a particular middlebox interference occurs. 
The tracebox software [tracebox] described in [IMC13a] is an 
extension of the popular traceroute software that enables network 
operators to check at which hop a particular field of the TCP header 
(including options) is modified. It has been used by several network 
operators to debug various middlebox interference problems. 
Experience with tracebox indicates that supporting the ICMP extension 
defined in [RFC1812] makes it easier to debug middlebox problems in 
IPv4 networks. 


Users of the Multipath TCP implementation have reported some 
experience with middlebox interference. The strangest scenario has 
been a middlebox that accepts the Multipath TCP options in the SYN 
segment but later replaces Multipath TCP options with a TCP EOL 
option [StrangeMbox]. This causes Multipath TCP to perform a 
fallback to regular TCP without any impact on the application. 


3.2. Congestion Control 


Congestion control has been an important challenge for Multipath TCP. 
The coupled congestion-control scheme defined in [RFC6356] is an
adaptation of the NewReno algorithm. A detailed description of this
coupled algorithm is provided in [NSDI11]. It is the default scheme 
in the Linux implementation of Multipath TCP, but Linux supports 
other schemes.


The second congestion-control scheme is OLIA [CONEXT12]. It is also
an adaptation of the NewReno single path congestion-control scheme to 
support multiple paths. Simulations [CONEXT12] and measurements 
[CONEXT13] have shown that it provides some performance benefits 
compared to the default coupled congestion-control scheme. 


The delay-based scheme proposed in [ICNP12] has also been ported to 
the Multipath TCP implementation in the Linux kernel. It has been 
evaluated by using simulations [ICNP12] and measurements [PaaschPhD]. 


BALIA, defined in [BALIA], provides a better balance between TCP 
friendliness, responsiveness, and window oscillation. 


These different congestion-control schemes have been compared in 
several articles. [CONEXT13] and [PaaschPhD] compare these 
algorithms in an emulated environment. The evaluation showed that 
the delay-based congestion-control scheme is less able to efficiently 
use the available links than the three other schemes. 
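

As a reminder of what "coupled" means for the default scheme, the
sketch below computes the window increase of [RFC6356] in the
congestion-avoidance phase. The formulas follow that RFC; the
function names, the representation of a subflow as a (cwnd, rtt)
pair, and the example values are assumptions made for illustration.
Windows are in bytes and round-trip times in seconds.

```python
def lia_alpha(subflows):
    """Aggressiveness factor alpha of the coupled scheme.

    subflows: list of (cwnd_bytes, rtt_seconds) tuples.
    """
    cwnd_total = sum(cwnd for cwnd, _ in subflows)
    best = max(cwnd / (rtt * rtt) for cwnd, rtt in subflows)
    denom = sum(cwnd / rtt for cwnd, rtt in subflows) ** 2
    return cwnd_total * best / denom

def lia_increase(subflows, i, bytes_acked, mss):
    """Window increase, in bytes, for one ACK received on subflow i."""
    cwnd_total = sum(cwnd for cwnd, _ in subflows)
    cwnd_i = subflows[i][0]
    coupled = lia_alpha(subflows) * bytes_acked * mss / cwnd_total
    uncoupled = bytes_acked * mss / cwnd_i  # what NewReno would do
    return min(coupled, uncoupled)

# Two identical subflows: each grows more slowly than a lone TCP flow.
flows = [(10_000, 0.1), (10_000, 0.1)]
print(lia_alpha(flows), lia_increase(flows, 0, 1_000, 1_000))

# A single subflow: alpha is 1 and the increase matches NewReno.
print(lia_alpha([(10_000, 0.1)]))
```

The min() term is what keeps a Multipath TCP connection from being
more aggressive on any one path than a regular TCP flow would be.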


3.3. Subflow Management 


The multipath capability of Multipath TCP comes from the utilization 
of one subflow per path. The Multipath TCP architecture [RFC6182] 
and the protocol specification [RFC6824] define the basic usage of 
the subflows and the protocol mechanisms that are required to create 
and terminate them. However, there are no guidelines on how subflows 
are used during the lifetime of a Multipath TCP session. Most of the 
published experiments with Multipath TCP have been performed in 
controlled environments. Still, based on the experience running them 
and discussions on the mptcp-dev mailing list, interesting lessons 
have been learned about the management of these subflows. 


From a subflow viewpoint, the Multipath TCP protocol is completely 
symmetrical. Both the client and the server have the capability to
create subflows. However, in practice, the existing Multipath TCP 
implementations have opted for a strategy where only the client 
creates new subflows. The main motivation for this strategy is that 
often the client resides behind a NAT or a firewall, preventing 
passive subflow openings on the client. Although there are 
environments such as datacenters where this problem does not occur, 
as of this writing, no precise requirement has emerged for allowing 
the server to create new subflows.


3.4. Implemented Subflow Managers


The Multipath TCP implementation in the Linux kernel includes several
strategies to manage the subflows that compose a Multipath TCP
session. The basic subflow manager is the full-mesh. As the name
implies, it creates a full-mesh of subflows between the communicating 
hosts. 


The most frequent use case for this subflow manager is a multihomed
client connected to a single-homed server. In this case, one subflow
is created for each interface on the client. The current
implementation of the full-mesh subflow manager is static. The
subflows are created immediately after the creation of the initial
subflow. If one subflow fails during the lifetime of the Multipath
TCP session (e.g., due to excessive retransmissions or the loss of
the corresponding interface), it is not always reestablished. There
is ongoing work to enhance the full-mesh path manager to deal with
such events.


When the server is multihomed, using the full-mesh subflow manager 
may lead to a large number of subflows being established. For 
example, consider a dual-homed client connected to a server with 
three interfaces. In this case, even if the subflows are only 
created by the client, six subflows will be established. This may be 
excessive in some environments, in particular when the client and/or 
the server have a large number of interfaces.  Implementations should 
limit the number of subflows that are used. 
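

The growth in the number of subflows is simply the product of the
number of addresses on each side. The sketch below enumerates the
full mesh for the dual-homed client and three-interface server of the
example; the address labels and the max_subflows parameter are
hypothetical and only illustrate the kind of limit suggested above.

```python
from itertools import product

def full_mesh(client_addrs, server_addrs, max_subflows=None):
    """Enumerate (client, server) address pairs, optionally capped."""
    pairs = list(product(client_addrs, server_addrs))
    if max_subflows is not None:
        pairs = pairs[:max_subflows]
    return pairs

client = ["c1", "c2"]
server = ["s1", "s2", "s3"]

print(len(full_mesh(client, server)))
print(full_mesh(client, server, max_subflows=4))
```

A simple cap such as this one truncates the list in enumeration
order; a real policy would more likely choose which pairs to keep
based on path properties.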


Creating subflows between multihomed clients and servers may 
sometimes lead to operational issues, as observed in discussions on
the mptcp-dev mailing list. In some cases, the network operators
would like to have better control over how the subflows are created
by Multipath TCP [MPTCP-MAX-SUB]. This might require the definition 
of policy rules to control the operation of the subflow manager. The 
two scenarios below illustrate some of these requirements. 


host1 ------------- switch1 ----- host2
  |                                 |
  +---------------- switch2 --------+


Figure 3: Simple Switched Network Topology 


Bonaventure, et al. Informational [Page 13] 


RFC 8041 MPTCP Experience January 2017 


Consider the simple network topology shown in Figure 3. From an
operational viewpoint, a network operator could want to create two
subflows between the communicating hosts. From a bandwidth
utilization viewpoint, the most natural paths are host1-switch1-host2
and host1-switch2-host2. However, a Multipath TCP implementation
running on these two hosts may sometimes have difficulty obtaining
this result.


To understand the difficulty, let us consider different allocation
strategies for the IP addresses. A first strategy is to assign two
subnets: subnetA (resp. subnetB) contains the IP addresses of host1's
interface to switch1 (resp. switch2) and host2's interface to switch1
(resp. switch2). In this case, a Multipath TCP subflow manager
should only create one subflow per subnet. To enforce the
utilization of these paths, the network operator would have to
specify a policy that prefers the subflows in the same subnet over
subflows between addresses in different subnets. It should be noted
that the policy should probably also specify how the subflow manager
should react when an interface or subflow fails.


A second strategy is to use a single subnet for all IP addresses. In 
this case, it becomes more difficult to specify a policy that 
indicates which subflows should be established. 
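

The same-subnet policy of the first strategy, and the reason it stops
helping under the second, can be sketched as follows. The function
name, the addresses, and the prefix lengths are hypothetical; the
Linux implementation does not expose such a policy in this form.

```python
import ipaddress

def same_subnet_pairs(local_ifaces, remote_addrs):
    """Keep only (local, remote) pairs that share a subnet.

    local_ifaces: strings such as "10.0.1.1/24" (address and prefix).
    remote_addrs: plain address strings such as "10.0.1.2".
    """
    pairs = []
    for iface in (ipaddress.ip_interface(i) for i in local_ifaces):
        for remote in (ipaddress.ip_address(a) for a in remote_addrs):
            if remote in iface.network:
                pairs.append((str(iface.ip), str(remote)))
    return pairs

host2 = ["10.0.1.2", "10.0.2.2"]

# First strategy: one subnet per switch, so one subflow per subnet.
print(same_subnet_pairs(["10.0.1.1/24", "10.0.2.1/24"], host2))

# Second strategy: a single subnet, so the rule no longer filters.
print(same_subnet_pairs(["10.0.1.1/16", "10.0.2.1/16"], host2))
```

With a single subnet, every pair matches, so the operator would need
some other criterion to select the two intended paths.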


The second subflow manager that is currently supported by the 
Multipath TCP implementation in the Linux kernel is the ndiffport 
subflow manager. This manager was initially created to exploit the 
path diversity that exists between single-homed hosts due to the 
utilization of flow-based load-balancing techniques [SIGCOMM11].
This subflow manager creates N subflows between the same pair of IP
addresses. The N subflows are created by the client and differ only 
in the source port selected by the client. It was not designed to be 
used on multihomed hosts. 


A more flexible subflow manager has been proposed, implemented and 
evaluated in [CONEXT15]. This subflow manager exposes various kernel 
events to a user space daemon that decides when subflows need to be 
created and terminated based on various policies.


3.5. Subflow Destination Port


The Multipath TCP protocol relies on the token contained in the
MP_JOIN option to associate a subflow with an existing Multipath TCP
session. This implies that there is no restriction on the source
address, destination address and source or destination ports used for 
the new subflow. The ability to use different source and destination 
addresses is key to support multihomed servers and clients. The 
ability to use different destination port numbers is worth discussing 
because it has operational implications. 


For illustration, consider a dual-homed client that creates a second 
subflow to reach a single-homed server as illustrated in Figure 4. 


client ------- r1 --- internet --- server
    |                   |
    +----------r2-------+


Figure 4: Multihomed-Client Connected to Single-Homed Server 


When the Multipath TCP implementation in the Linux kernel creates the 
second subflow, it uses the same destination port as the initial 
subflow. This choice is motivated by the fact that the server might 
be protected by a firewall and only accept TCP connections (including 
subflows) on the official port number. Using the same destination 
port for all subflows is also useful for operators that rely on the 
port numbers to track application usage in their network. 


There have been suggestions from Multipath TCP users to modify the 
implementation to allow the client to use different destination ports 
to reach the server. This suggestion seems mainly motivated by 
traffic-shaping middleboxes that are used in some wireless networks. 
In networks where different shaping rates are associated with 
different destination port numbers, this could allow Multipath TCP to 
reach a higher performance. This behavior is valid according to the 
Multipath TCP specification [RFC6824]. An application could use an 
enhanced socket API [SOCKET] to behave in this way. 


However, from an implementation point of view, supporting different
destination ports for the same Multipath TCP connection can cause
some issues. A legacy implementation of a TCP stack creates a
listening socket to react upon incoming SYN segments. The listening
socket handles the SYN segments that are sent to a specific port
number. Demultiplexing incoming segments can thus be done solely by
looking at the IP addresses and the port numbers. With Multipath TCP,
however, incoming SYN segments may carry an MP_JOIN option with a
different destination port. This means that all incoming segments
that did not match an existing listening socket or an already
established socket must be parsed for a possible MP_JOIN option.
This imposes an additional cost on servers that did not exist in
legacy TCP implementations.
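

The lookup order described above can be sketched as follows. This is
a minimal illustration and not the code of any real stack: the
Segment and Demux names, the three dictionaries, and the
option_parses counter are assumptions introduced here only to make
the extra parsing step visible.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Segment:
    src: str
    sport: int
    dst: str
    dport: int
    syn: bool = False
    # Token carried by an MP_JOIN option, or None if the option is absent.
    mp_join_token: Optional[int] = None


@dataclass
class Demux:
    established: dict = field(default_factory=dict)  # 4-tuple -> socket
    listeners: dict = field(default_factory=dict)    # (dst, dport) -> socket
    tokens: dict = field(default_factory=dict)       # token -> MPTCP session
    option_parses: int = 0                           # extra work counter

    def lookup(self, seg: Segment):
        four_tuple = (seg.src, seg.sport, seg.dst, seg.dport)
        # Legacy step 1: established sockets, matched on addresses and ports.
        if four_tuple in self.established:
            return self.established[four_tuple]
        # Legacy step 2: listening sockets, matched on the destination port.
        if seg.syn and (seg.dst, seg.dport) in self.listeners:
            return self.listeners[(seg.dst, seg.dport)]
        # Additional step: every unmatched segment has its options parsed
        # to look for an MP_JOIN token, whatever the destination port is.
        self.option_parses += 1
        if seg.syn and seg.mp_join_token in self.tokens:
            return self.tokens[seg.mp_join_token]
        return None
```

In this model, a SYN sent to an unused port costs one option parse
even when it carries no MP_JOIN at all, which is the overhead the
paragraph above refers to.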


3.6. Closing Subflows 


      client                                          server
         |                                               |
MPTCP: ESTABLISHED                            MPTCP: ESTABLISHED
Sub: ESTABLISHED                              Sub: ESTABLISHED
         |                                               |
         |               DATA_FIN                        |
         | <------------------------               close()   (step 1)
MPTCP: CLOSE-WAIT                                        |
Sub: ESTABLISHED                                         |
         |               DATA_ACK                        |
         | ------------------------>          MPTCP: FIN-WAIT-2
         |                                    Sub: ESTABLISHED
         |                                               |
         |        DATA_FIN + subflow-FIN                 |
close()/shutdown() | ------------------------> MPTCP: TIME-WAIT
(step 2)           |     DATA_ACK              Sub: CLOSE-WAIT
MPTCP: CLOSED      | <------------------------           |
Sub: FIN-WAIT-2    |                                     |
         |                                               |
         |             subflow-FIN                       |
MPTCP: CLOSED      | <------------------------     subflow-close()
Sub: TIME-WAIT     |     subflow-ACK                     |
(step 3)           | ------------------------> MPTCP: TIME-WAIT
         |                                     Sub: CLOSED


Figure 5: Multipath TCP may not be able to avoid time-wait state on 
the subflow (indicated as Sub in the drawing), even if enforced by 
the application on the client-side. 


Figure 5 shows a very particular issue within Multipath TCP. Many 
high-performance applications try to avoid TIME-WAIT state by 
deferring the closure of the connection until the peer has sent a 
FIN. That way, the client on the left of Figure 5 does a passive 
closure of the connection, transitioning from CLOSE-WAIT to Last-ACK 
and finally freeing the resources after reception of the ACK of the 
FIN. An application running on top of an MPTCP-enabled Linux kernel 
might also use this approach. The difference here is that the 
close() of the connection (step 1 in Figure 5) only triggers the
sending of a DATA_FIN. Nothing guarantees that the kernel is ready
to combine the DATA_FIN with a subflow-FIN. The reception of the
DATA_FIN will make the application trigger the closure of the
connection (step 2), trying to avoid TIME-WAIT state with this late
closure. This time, the kernel might decide to combine the DATA_FIN
with a subflow-FIN. This decision will be fatal, as the subflow's
state machine will not transition from CLOSE-WAIT to Last-ACK, but
rather go through FIN-WAIT-2 into TIME-WAIT state. The TIME-WAIT
state will consume resources on the host for at least 2 MSL (Maximum 
Segment Lifetime). Thus, a smart application that tries to avoid 
TIME-WAIT state by doing late closure of the connection actually ends 
up with one of its subflows in TIME-WAIT state. A high-performance 
Multipath TCP kernel implementation should honor the desire of the 
application to do passive closure of the connection and successfully 
avoid TIME-WAIT state -- even on the subflows. 


The solution to this problem lies in an optimistic assumption that a 
host doing active-closure of a Multipath TCP connection by sending a 
DATA_FIN will soon also send a FIN on all its subflows. Thus, the
passive closer of the connection can simply wait for the peer to send 
exactly this FIN -- enforcing passive closure even on the subflows. 
Of course, to avoid consuming resources indefinitely, a timer must 
limit the time our implementation waits for the FIN. 
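

The waiting strategy with its guard timer can be modeled as a small
state machine. The sketch below is an assumption-laden illustration,
not the Linux implementation: the class name, the wait_for_fin
parameter, and the explicit "now" arguments are hypothetical, and
simultaneous close (CLOSING state) is deliberately left out.

```python
from enum import Enum


class Sub(Enum):
    ESTABLISHED = "ESTABLISHED"
    CLOSE_WAIT = "CLOSE-WAIT"
    LAST_ACK = "LAST-ACK"
    FIN_WAIT_1 = "FIN-WAIT-1"
    FIN_WAIT_2 = "FIN-WAIT-2"
    TIME_WAIT = "TIME-WAIT"
    CLOSED = "CLOSED"


class PassiveSubflowCloser:
    """Subflow of a host that received a DATA_FIN before closing."""

    def __init__(self, wait_for_fin: float):
        self.wait_for_fin = wait_for_fin  # guard timer, in seconds
        self.deadline = None              # set once the application closed
        self.state = Sub.ESTABLISHED
        self.history = [self.state]

    def _move(self, new_state):
        self.state = new_state
        self.history.append(new_state)

    def on_app_close(self, now: float):
        if self.state is Sub.CLOSE_WAIT:
            self._move(Sub.LAST_ACK)      # peer FIN already seen: send FIN
        elif self.state is Sub.ESTABLISHED:
            # Hold back our subflow FIN and wait for the peer's FIN.
            self.deadline = now + self.wait_for_fin

    def on_peer_fin(self):
        if self.state is Sub.ESTABLISHED:
            self._move(Sub.CLOSE_WAIT)
            if self.deadline is not None:  # application closed earlier
                self._move(Sub.LAST_ACK)   # now send our FIN passively
        elif self.state is Sub.FIN_WAIT_2:
            self._move(Sub.TIME_WAIT)

    def on_timer(self, now: float):
        # The peer never sent its FIN: give up and close actively.
        if (self.state is Sub.ESTABLISHED and self.deadline is not None
                and now >= self.deadline):
            self._move(Sub.FIN_WAIT_1)

    def on_fin_acked(self):
        if self.state is Sub.LAST_ACK:
            self._move(Sub.CLOSED)
        elif self.state is Sub.FIN_WAIT_1:
            self._move(Sub.FIN_WAIT_2)
```

When the peer's FIN arrives before the deadline, the subflow goes
through CLOSE-WAIT and LAST-ACK and never enters TIME-WAIT. Only when
the timer expires does it fall back to an active close.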


3.7. Packet Schedulers 


In a Multipath TCP implementation, the packet scheduler is the
algorithm that is executed when transmitting each packet to decide on
which subflow it needs to be transmitted. The packet scheduler
itself does not have any impact on the interoperability of Multipath
TCP implementations. However, it may clearly impact the performance
of Multipath TCP sessions. The Multipath TCP implementation in the
Linux kernel supports a pluggable architecture for the packet
scheduler [PaaschPhD]. As of this writing, two schedulers have been
implemented: round-robin and lowest-rtt-first. The second scheduler
relies on the round-trip time (rtt) measured on each TCP subflow and
sends segments first over the subflow having the lowest round-trip
time. They are compared in [CSWS14]. The experiments and
measurements described in [CSWS14] show that the lowest-rtt-first
scheduler appears to be the best compromise from a performance
viewpoint. Another study of the packet schedulers is presented in
[PAMS2014]. This study relies on simulations with the Multipath TCP
implementation in the Linux kernel. It compares the lowest-rtt-first
scheduler with the round-robin and a random scheduler. It shows some
situations where the lowest-rtt-first scheduler does not perform as
well as the other schedulers, but there are many scenarios where the
opposite is true. [PAMS2014] notes that "it is highly likely that
the optimal scheduling strategy depends on the characteristics of the
paths being used."
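

The two schedulers named above can be sketched in a few lines. This
is a simplified model and not the kernel code: the Subflow fields,
the assumption that a subflow is usable while it has congestion
window space, and the schedule() helper are all introduced here for
illustration.

```python
from dataclasses import dataclass


@dataclass
class Subflow:
    name: str
    srtt_ms: float      # smoothed round-trip time of the subflow
    cwnd: int           # congestion window, in packets
    in_flight: int = 0  # packets sent and not yet acknowledged

    def has_space(self) -> bool:
        return self.in_flight < self.cwnd


def lowest_rtt_first(subflows):
    """Pick the usable subflow with the lowest round-trip time."""
    usable = [s for s in subflows if s.has_space()]
    return min(usable, key=lambda s: s.srtt_ms) if usable else None


class RoundRobin:
    """Pick usable subflows in turn, regardless of their RTT."""

    def __init__(self):
        self._next = 0

    def pick(self, subflows):
        n = len(subflows)
        for i in range(n):
            candidate = subflows[(self._next + i) % n]
            if candidate.has_space():
                self._next = (self._next + i + 1) % n
                return candidate
        return None


def schedule(pick, subflows, n_segments):
    """Assign up to n_segments segments and return the subflow names used."""
    order = []
    for _ in range(n_segments):
        chosen = pick(subflows)
        if chosen is None:  # every congestion window is full
            break
        chosen.in_flight += 1
        order.append(chosen.name)
    return order
```

With a 20 ms subflow and a 60 ms subflow that each have a window of
two packets, lowest-rtt-first fills the faster subflow before using
the slower one, while round-robin alternates between them from the
first segment.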


3.8. Segment Size Selection 


When an application performs a write/send system call, the kernel
allocates a packet buffer (sk_buff in Linux) to store the data the
application wants to send. The kernel will store at most one MSS
(Maximum Segment Size) of data per buffer. As the MSS can differ
amongst subflows, an MPTCP implementation must carefully select the
MSS used to generate application data. The Linux kernel
implementation had various ways of selecting the MSS: minimum or
maximum amongst the different subflows. However, these heuristics of
MSS selection can cause significant performance issues in some
environments. Consider the following example. An MPTCP connection
has two established subflows that respectively use an MSS of 1420 and
1428 bytes. If MPTCP selects the maximum, then the application will
generate segments of 1428 bytes of data. An MPTCP implementation
will have to split each such segment in two (1420-byte and 8-byte)
segments when pushing on the subflow with the smallest MSS. The latter
segment will introduce a large overhead as this single data segment
will use 2 slots in the congestion window (in packets), therefore
roughly halving the potential throughput (in bytes/s) of
this subflow. Taking the smallest MSS does not solve the issue as
there might be a case where the subflow with the smallest MSS only
sends a few packets, therefore reducing the potential throughput of
the other subflows.


The Linux implementation recently took another approach [DetalMSS].
Instead of selecting the minimum or the maximum value, it now
dynamically adapts the MSS based on the contribution of all the
subflows to the connection's throughput. For each subflow, it
computes the potential throughput achieved by selecting each MSS
value and by taking into account the lost space in the congestion
window. It then selects the MSS that achieves the highest
potential throughput.
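

The trade-off can be made concrete with a small calculation. The
sketch below is not the algorithm of [DetalMSS]; it is a simplified
model that assumes each subflow is described by its MSS and by its
share of the connection's throughput, and that scores a candidate MSS
by the fraction of congestion window space it fills with data. The
function names and the "share" weights are hypothetical.

```python
import math


def cwnd_slots(buffer_bytes: int, subflow_mss: int) -> int:
    """Congestion window slots (packets) needed to send one buffer."""
    return math.ceil(buffer_bytes / subflow_mss)


def efficiency(buffer_bytes: int, subflow_mss: int) -> float:
    """Fraction of the occupied window space that carries data."""
    slots = cwnd_slots(buffer_bytes, subflow_mss)
    return buffer_bytes / (slots * subflow_mss)


def select_mss(subflows):
    """subflows: list of (mss, share) pairs, with shares summing to 1.

    Returns the candidate MSS with the highest weighted efficiency.
    """
    candidates = sorted({mss for mss, _ in subflows})

    def score(candidate):
        return sum(share * efficiency(candidate, mss)
                   for mss, share in subflows)

    return max(candidates, key=score)
```

For the example in the text, a 1428-byte buffer sent on the subflow
with an MSS of 1420 bytes needs two slots and fills only about half
of them. With equal shares, this model selects 1420; if the 1420-byte
subflow carries only 1% of the traffic, it selects 1428.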


Given the prevalence of middleboxes that clamp the MSS, Multipath TCP 
implementations must be able to efficiently support subflows with 
different MSS values. The strategy described above is a possible 
solution to this problem.


3.9. Interactions with the Domain Name System


Multihomed clients such as smartphones can send DNS queries over any 
of their interfaces. When a single-homed client performs a DNS 
query, it receives from its local resolver the best answer for its 
request. If the client is multihomed, the answer in response to the 
DNS query may vary with the interface over which it has been sent. 


                   cdn1
                    |
        client -- cellular -- internet -- cdn3
           |                   |
           +----- wifi --------+
                    |
                  cdn2


Figure 6: Simple Network Topology 


If the client sends a DNS query over the WiFi interface, the answer
will point to the cdn2 server while the same request sent over the
cellular interface will point to the cdn1 server. This might cause
problems for CDN providers that locate their servers inside ISP
networks and have contracts that specify that the CDN server will
only be accessed from within this particular ISP. Assume now that
both the client and the CDN servers support Multipath TCP. In this
case, a Multipath TCP session from cdn1 or cdn2 would potentially use
both the cellular network and the WiFi network. Serving the client
from cdn2 over the cellular interface could violate the contract
between the CDN provider and the network operators. A similar
problem occurs with regular TCP if the client caches DNS replies.
For example, the client obtains a DNS answer over the cellular
interface and then stops this interface and starts to use its WiFi
interface. If the client retrieves data from cdn1 over its WiFi
interface, this may also violate the contract between the CDN and the
network operators.


A possible solution to prevent this problem would be to modify the 
DNS resolution on the client. The client subnet Extension Mechanisms 
for DNS (EDNS) defined in [RFC7871] could be used for this purpose. 
When the client sends a DNS query from its WiFi interface, it should 
also send the client subnet corresponding to the cellular interface 
in this request. This would indicate to the resolver that the answer 
should be valid for both the WiFi and the cellular interfaces (e.g., 
the cdn3 server). 
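

As an illustration, the client subnet option that would accompany
such a query can be encoded as follows. The wire layout (option code
8, address family, source and scope prefix lengths, truncated
address) follows [RFC7871]; the function name is hypothetical, and
the prefix 198.51.100.0/24 is a documentation prefix standing in for
the subnet of the cellular interface. The sketch only builds the
option bytes and does not send any DNS message.

```python
import ipaddress
import struct

ECS_OPTION_CODE = 8  # EDNS Client Subnet option code


def build_ecs_option(subnet: str) -> bytes:
    """Encode an EDNS Client Subnet option for the given prefix."""
    net = ipaddress.ip_network(subnet, strict=False)
    family = 1 if net.version == 4 else 2  # 1 = IPv4, 2 = IPv6
    source_prefix_len = net.prefixlen
    scope_prefix_len = 0                   # must be 0 in queries
    # Only the octets covered by the prefix are sent; the bits beyond
    # the prefix are already zero in the network address.
    n_octets = (source_prefix_len + 7) // 8
    address = net.network_address.packed[:n_octets]
    payload = struct.pack("!HBB", family, source_prefix_len,
                          scope_prefix_len) + address
    return struct.pack("!HH", ECS_OPTION_CODE, len(payload)) + payload


# Option a client could attach to a query sent over WiFi to announce
# the (assumed) subnet of its cellular interface.
cellular_option = build_ecs_option("198.51.100.0/24")
```

A client following the approach described above would place these
bytes in the OPT pseudo-record of the query it sends from the WiFi
interface.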


3.10. Captive Portals


Multipath TCP enables a host to use different interfaces to reach a
server. In theory, this should ensure connectivity when at least one
of the interfaces is active. However, in practice, there are some
particular scenarios with captive portals that may cause operational
problems. The reference environment is shown in Figure 7.


client ----- network1
     |
     +----- internet ------------- server


Figure 7: Issue with Captive Portal 


The client is attached to two networks: network1, which provides only
limited connectivity, and a second network that gives access to the
entire Internet. In practice, this scenario corresponds to an open
WiFi network with a captive portal for network1 and a cellular
service for the second interface. On many smartphones, the WiFi
interface is preferred over the cellular interface. If the
smartphone learns a default route via both interfaces, it will
typically prefer to use the WiFi interface to send its DNS request
and create the first subflow. This is not optimal with Multipath
TCP. A better approach would probably be to try a few attempts on
the WiFi interface and then, upon failure of these attempts, try to
use the second interface for the initial subflow as well.
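

The suggested fallback can be sketched as a simple loop. The names
below are hypothetical: "connect" stands for whatever routine tries
to establish the initial subflow over a given interface and reports
success, and the default of three attempts is an arbitrary choice
for "a few".

```python
def open_initial_subflow(connect, interfaces, attempts_per_interface=3):
    """Try each interface in preference order, a few times each.

    connect(interface) must return True when the initial subflow was
    established over that interface. Returns the interface that
    worked, or None if every attempt failed.
    """
    for interface in interfaces:
        for _ in range(attempts_per_interface):
            if connect(interface):
                return interface
    return None


# Example with a stand-in for a WiFi network behind a captive portal:
# attempts over "wifi" fail and the cellular interface then succeeds.
attempt_log = []


def fake_connect(interface):
    attempt_log.append(interface)
    return interface == "cellular"


chosen = open_initial_subflow(fake_connect, ["wifi", "cellular"])
```

In this example the preferred WiFi interface is tried three times
before the initial subflow is created over the cellular interface.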


3.11. Stateless Webservers


MPTCP has been designed to interoperate with webservers that benefit
from SYN-cookies to protect against SYN-flooding attacks [RFC4987].
MPTCP achieves this by echoing the keys negotiated during the
MP_CAPABLE handshake in the third ACK of the three-way handshake.
Reception of this third ACK then allows the server to reconstruct the
state specific to MPTCP.


However, one caveat to this mechanism is the unreliable nature of the 
third ACK. Indeed, when the third ACK gets lost, the server will not 
be able to reconstruct the MPTCP state. MPTCP will fall back to 
regular TCP in this case. This is in contrast to regular TCP. When 
the client starts sending data, the first data segment also includes 
the SYN-cookie, which allows the server to reconstruct the TCP-state. 
Further, this data segment will be retransmitted by the client in 
case it gets lost and thus is resilient against loss. MPTCP does not 
include the keys in this data segment and thus the server cannot 
reconstruct the MPTCP state.


This issue might be considered a minor one for MPTCP. However,
losing the third ACK should only happen when packet loss is high, and
this is precisely the case in which MPTCP provides a lot of benefits,
as it can move traffic away from the lossy link. It is therefore
undesirable that MPTCP has a higher chance to fall back to regular
TCP in those lossy environments.


[MPTCP-DEPLOY] discusses this issue and suggests a modified handshake 
mechanism that ensures reliable delivery of the MP_CAPABLE, following
the three-way handshake. This modification will make MPTCP reliable, 
even in lossy environments when servers need to use SYN-cookies to 
protect against SYN-flooding attacks. 


3.12.  Load-Balanced Server Farms 


Large-scale server farms typically deploy thousands of servers behind 
a single virtual IP (VIP). Steering traffic to these servers is done 
through Layer 4 load-balancers that ensure that a TCP-flow will 
always be routed to the same server [Presto08]. 


As Multipath TCP uses multiple different TCP subflows to steer the 
traffic across the different paths, load-balancers need to ensure 
that all these subflows are routed to the same server. This implies 
that the load-balancers need to track the MPTCP-related state, 
allowing them to parse the token in the MP_JOIN and assign those
subflows to the appropriate server. However, server farms typically 
deploy several load-balancers for reliability and capacity reasons. 
As a TCP subflow might get routed to any of these load-balancers, 
they would need to synchronize the MPTCP-related state -- a solution 
that is not feasible on a large scale. 


The token (carried in the MP_JOIN) contains the information
indicating to which MPTCP session the subflow belongs. As the token
is a hash of the key, servers are not able to generate the token in
such a way that the token can provide the necessary information to
the load-balancers, which would allow them to route TCP subflows to
the appropriate server. [MPTCP-LOAD] discusses this issue in detail
and suggests two alternative MP_CAPABLE handshakes to overcome it.
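

The following sketch shows how the token is derived in [RFC6824]: it
is the most significant 32 bits of the SHA-1 hash of the 64-bit key.
The function name is hypothetical. Because the output of a hash
cannot be chosen, a server has no practical way to place routing
information, such as a server identifier, inside the token.

```python
import hashlib


def mptcp_token(key: bytes) -> int:
    """Token for a 64-bit MPTCP key, as specified in RFC 6824."""
    if len(key) != 8:
        raise ValueError("an MPTCP key is 64 bits (8 bytes) long")
    digest = hashlib.sha1(key).digest()
    # The token is the most significant 32 bits of the digest.
    return int.from_bytes(digest[:4], "big")


# Example key chosen for illustration only.
example_token = mptcp_token(bytes.fromhex("0102030405060708"))
```

A load-balancer that sees only the token in an MP_JOIN therefore
learns nothing about which server holds the session unless it shares
state with the component that stored the key.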


4. Security Considerations 


This informational document discusses use cases and operational 
experience with Multipath TCP. An extensive analysis of the 
remaining security issues in the Multipath TCP specification has been 
published in [RFC7430], together with suggestions for possible 
solutions.


From a security viewpoint, it is important to note that Multipath
TCP, like other multipath solutions such as SCTP, has the ability to
send packets belonging to a single connection over different paths.
This design feature of Multipath TCP implies that middleboxes that
have been deployed on-path assuming that they would observe all the
packets exchanged for a given connection in both directions may not
function correctly anymore. Typical examples are firewalls,
Intrusion Detection Systems (IDSs), or Deep Packet Inspection (DPI)
devices deployed in enterprise networks. Those devices expect to
observe all the packets from all TCP connections. With Multipath TCP,
those middleboxes may no longer observe all packets since some of them
may follow a different path. The two examples below illustrate
typical deployments of such middleboxes. The first example,
Figure 8, shows an MPTCP-enabled smartphone attached to both an
enterprise and a cellular network. If a Multipath TCP connection is
established by the smartphone towards a server, some of the packets
sent by the smartphone or the server may be transmitted over the
cellular network and thus be invisible for the enterprise middlebox.


smartphone +----- enterprise net --- MBox----+------ server
           |                                 |
           +----- cellular net  -------------+


Figure 8: Enterprise Middlebox May Not Observe 
All Packets from Multihomed Host 


The second example, Figure 9, shows a possible issue when multiple
middleboxes are deployed inside a network. For simplicity, we assume
that network1 is the default IPv4 path while network2 is the default
IPv6 path. A similar issue could occur with per-flow load-balancing
such as ECMP [RFC2992]. With regular TCP, all packets from each
connection would either pass through MBox1 or MBox2. With Multipath
TCP, the client can easily establish a subflow over network1 and
another over network2 and each middlebox would only observe a part of
the traffic of the end-to-end Multipath TCP connection.


client ----R-- network1  --- MBox1 -----R------------- server
           |                            |
           +-- network2  --- MBox2 -----+


Figure 9: Interactions between 
Load-Balancing and Security Middleboxes 


In these two cases, it is possible for an attacker to evade some
security measures operating on the TCP byte stream and implemented on
the middleboxes by controlling the bytes that are actually sent over
each subflow, and there are tools that ease those kinds of evasion
[PZ15] [PT14]. This is not a security issue for Multipath TCP itself
since Multipath TCP behaves correctly. However, this demonstrates
the difficulty of enforcing security policies by relying only on
on-path middleboxes instead of enforcing them directly on the
endpoints.
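

The difference between what an on-path middlebox observes and what
an endpoint observes can be shown with a toy model. Everything in
the sketch is an assumption made for illustration: the byte stream
is split in fixed-size chunks that alternate between two subflows,
each middlebox sees the bytes of a single subflow, and a policy is
modeled as a plain substring match.

```python
def split_over_subflows(stream: bytes, n_subflows: int, chunk: int):
    """Return, per subflow, a list of (data_offset, bytes) mappings."""
    views = [[] for _ in range(n_subflows)]
    for i, offset in enumerate(range(0, len(stream), chunk)):
        views[i % n_subflows].append((offset, stream[offset:offset + chunk]))
    return views


def middlebox_view(mappings) -> bytes:
    """Bytes seen by a middlebox that observes a single subflow."""
    return b"".join(data for _, data in mappings)


def endpoint_view(views) -> bytes:
    """Bytes seen by the endpoint after data-level reassembly."""
    mappings = sorted(m for view in views for m in view)
    return b"".join(data for _, data in mappings)


stream = b"GET /FORBIDDEN HTTP/1.1"
pattern = b"FORBIDDEN"
views = split_over_subflows(stream, n_subflows=2, chunk=4)

seen_by_mbox1 = pattern in middlebox_view(views[0])
seen_by_mbox2 = pattern in middlebox_view(views[1])
seen_by_endpoint = pattern in endpoint_view(views)
```

In this model neither middlebox matches the pattern, while the
endpoint, which reassembles the complete byte stream, does. This is
consistent with the observation above that such policies are more
robust when enforced on the endpoints.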